DSAN 5100
  • Introduction
  • Statistical Analysis
  • Results
    • Overall Health
    • Maternal Health
    • Maternal Wellness
    • Child Health
    • Child Wellness
  • Conclusions
    • Travel Implications
    • Complications & Unsafe Abortion
    • Conclusion
  • Bibliography
  • Appendix

On this page

  • Economic Wellness
    • Parental Leave by Abortion Policy
    • Parental Leave Job Protection by Abortion Policy
    • Percentage of Employed Women per State by Abortion Policy
  • Education Wellness
    • Percentage of Women per State with a BA degree or higher by Abortion Policy
  • Mental Wellness
    • Women with Postpartum Depression by Abortion Policy
    • Women with Depression Before or During Pregnancy by Abortion Policy
    • Percentage of Women Who Received a Postpartum Depression Screening in 2021, by Abortion Policy
    • Percentage of Women Experiencing Intimate Partner Violence by Abortion Policy
  • Travel

Maternal Wellness - Statistical Modeling

Author

Viviana Luccioli

Code
options(repos = c(CRAN = "https://cran.r-project.org"))
install.packages("tidyverse")

The downloaded binary packages are in
    /var/folders/zl/mwx636ts1pbc006v4wqp6fwr0000gn/T//RtmppaDaK2/downloaded_packages
Code
library(tidyverse)
library(readr)
df <- read_csv("../data/clean_data/merged_data.csv", show_col_types = FALSE)
df <- df[!is.na(df$abortion_policies), ]

Economic Wellness

Parental Leave by Abortion Policy

Code
# reorder the levels of abortion policies
df$abortion_policies <- factor(df$abortion_policies, 
                              levels = c("most protective", "very protective", 
                                         "protective", "some restrictions/protections", 
                                         "restrictive", "very restrictive", "most restrictive"))

# create variable combining states that have declared parental leave policy as of now, even if not enacted
df$parental_leave_mandatory <- pmax(df$parental_leave_mandatory_enacted, df$parental_leave_mandatory_not_yet_enacted)

# stacked bar plot
ggplot(df, aes(x = as.factor(parental_leave_mandatory), fill = abortion_policies)) +
  geom_bar(position = "fill") +
  labs(x = "Parental leave policy", 
       y = "Proportion", 
       fill = "Level of state abortion policy",
       title = "State Abortion Policies by Parental Leave Requirements") +
  scale_x_discrete(labels = c("0" = "Not Mandated", "1" = "Mandated")) +
  scale_fill_manual(values = c("#1c7416", "#68bb59", "#acdf87", "#fab733", "#ff6242", "#ff0000", "#c61a09")) +
  theme_classic()

This plot shows us that while the vast majority of states that have not enacted mandatory parental leave have more restrictive abortion policies, the vast majority of states that do mandate parental leave have protective abortion policies. So, the states that want to prohibit women from accessing contraceptive care also enable the worst conditions for a woman when she has a baby: an unfavorable and unprotected employment situation.

Code
# fisher's exact test
contingency_table <- table(df$parental_leave_mandatory, df$abortion_policies)

fisher_test_result <- fisher.test(contingency_table)
print(fisher_test_result)

    Fisher's Exact Test for Count Data

data:  contingency_table
p-value = 6.816e-05
alternative hypothesis: two.sided

The p-value produced from Fisher’s Exact Test indicates that we have significant evidence to suggest that abortion policy and mandatory parental leave policy are not independent of each other. This suggests that levels of abortion_policies are strongly related to whether parental leave is mandatory.

Now we will perform logistic regression to investigate the direction of this relationship:

Code
# check assumptions 

# sample size 
table(df$parental_leave_mandatory, df$abortion_policies)
   
    most protective very protective protective some restrictions/protections
  0               1               2          4                             3
  1               1               5          6                             2
   
    restrictive very restrictive most restrictive
  0           7                4               16
  1           0                0                0
Code
# the sample sizes are too small, so we must use exact logistic regression to account for this size and separation
Code
# order the levels of abortion policy
df$abortion_policies <- factor(df$abortion_policies, 
                               levels = c("most restrictive", 
                                          "very restrictive", 
                                          "restrictive", 
                                          "some restrictions/protections", 
                                          "protective", 
                                          "very protective", 
                                          "most protective"))

library(logistf)
Warning in check_dep_version(): ABI version mismatch: 
lme4 was built with Matrix ABI version 1
Current Matrix ABI version is 0
Please re-install lme4 from source or restore original 'Matrix' package
Code
# Fit the exact logistic regression model
fit <- logistf(parental_leave_mandatory ~ factor(abortion_policies), data = df)

# View the results
summary(fit)
logistf(formula = parental_leave_mandatory ~ factor(abortion_policies), 
    data = df)

Model fitted by Penalized ML
Coefficients:
                                                             coef se(coef)
(Intercept)                                            -3.4965076 1.435481
factor(abortion_policies)very restrictive               1.2992830 2.069500
factor(abortion_policies)restrictive                    0.7884574 2.047911
factor(abortion_policies)some restrictions/protections  3.1600353 1.657203
factor(abortion_policies)protective                     3.8642323 1.560985
factor(abortion_policies)very protective                4.2849649 1.625554
factor(abortion_policies)most protective                3.4965076 1.842265
                                                       lower 0.95 upper 0.95
(Intercept)                                            -8.3448894  -1.499056
factor(abortion_policies)very restrictive              -3.9858138   6.588116
factor(abortion_policies)restrictive                   -4.4758771   6.053943
factor(abortion_policies)some restrictions/protections  0.4062025   8.175971
factor(abortion_policies)protective                     1.4626807   8.807042
factor(abortion_policies)very protective                1.7307768   9.283523
factor(abortion_policies)most protective                0.2091242   8.681275
                                                            Chisq            p
(Intercept)                                            19.0554958 1.269716e-05
factor(abortion_policies)very restrictive               0.3735875 5.410552e-01
factor(abortion_policies)restrictive                    0.1450325 7.033284e-01
factor(abortion_policies)some restrictions/protections  5.1499418 2.324715e-02
factor(abortion_policies)protective                    12.0956887 5.053855e-04
factor(abortion_policies)very protective               13.1051061 2.944920e-04
factor(abortion_policies)most protective                4.3329275 3.738190e-02
                                                       method
(Intercept)                                                 2
factor(abortion_policies)very restrictive                   2
factor(abortion_policies)restrictive                        2
factor(abortion_policies)some restrictions/protections      2
factor(abortion_policies)protective                         2
factor(abortion_policies)very protective                    2
factor(abortion_policies)most protective                    2

Method: 1-Wald, 2-Profile penalized log-likelihood, 3-None

Likelihood ratio test=22.39574 on 6 df, p=0.0010263, n=51
Wald test = 13.13625 on 6 df, p = 0.04092253
Code
# Generate predicted probabilities
df$predicted_prob <- predict(fit, type = "response")

# visualize
ggplot(df, aes(x = abortion_policies, y = predicted_prob)) +
  geom_point(position = position_jitter(width = 0.2), alpha = 0.6) +
  stat_summary(fun = mean, geom = "point", color = "blue", size = 3) +
  labs(x = "Abortion Policies", 
       y = "Predicted Probability",
       title = "Mandatory Parental Leave Policy \nPredicted Probabilities by Abortion Policy") +
  theme_minimal() +
  theme(axis.text.x = element_text(angle = 45, hjust = 1))

The exact logistic regression model yields a p-value of 0.0010263, revealing that there is sufficient evidence to suggest that there is a sloped relationship between whether or not a state has mandatory abortion policy and the degree of their abortion policy restrictiveness/protectiveness. The slope coefficients and the visualization indicate that the slope between the abortion policy level and having parental leave legislation (y=1) gets more positive as the policy level gets more protective regarding abortion. This indicates that states that have abortion policies that are protective for women are more likely to also have mandatory parental leave policy than states with restrictive abortion policy. This is important, because it means that the states that choose to make abortion less accessible and therefore force women to have babies, also foster conditions that make it incredibly difficult for women to have babies. This highlights a lack of care for mothers’ wellbeing and financial stability in states with restrictive abortion rights.

Parental Leave Job Protection by Abortion Policy

Code
# reorder the levels of abortion policies
df$abortion_policies <- factor(df$abortion_policies, 
                              levels = c("most protective", "very protective", 
                                         "protective", "some restrictions/protections", 
                                         "restrictive", "very restrictive", "most restrictive"))

# stacked bar plot
ggplot(df, aes(x = as.factor(parental_leave_job_protection), fill = abortion_policies)) +
  geom_bar(position = "fill") +
  labs(x = "Parental leave job protection policy", 
       y = "Proportion", 
       fill = "Level of state abortion policy",
       title = "State Abortion Policies by Job Protection Requirements") +
  scale_x_discrete(labels = c("0" = "No Job Protection", "1" = "Job Protection")) +
  scale_fill_manual(values = c("#1c7416", "#68bb59", "#acdf87", "#fab733", "#ff6242", "#ff0000", "#c61a09")) +
  theme_classic() +
  theme(plot.title = element_text(hjust = 0.5))

Code
# fisher's exact test
contingency_table <- table(df$parental_leave_job_protection, df$abortion_policies)

fisher_test_result <- fisher.test(contingency_table)
print(fisher_test_result)

    Fisher's Exact Test for Count Data

data:  contingency_table
p-value = 0.006869
alternative hypothesis: two.sided

Fisher’s Exact Test yields a p-value of 0.006869, which indicates that we have sufficient evidence to suggest that there is an association between a state’s abortion policies and whether or not they have policy for job protection when a parent takes parental leave.

Code
# check assumptions: 
# tabulate categories to see if sample sizes are sufficient
table(df$parental_leave_job_protection, df$abortion_policies)
   
    most protective very protective protective some restrictions/protections
  0               1               4          6                             3
  1               1               3          4                             2
   
    restrictive very restrictive most restrictive
  0           7                4               16
  1           0                0                0
Code
# since the sample sizes are insufficient, we must use exact logistic regression 
Code
# order the levels of abortion policy
df$abortion_policies <- factor(df$abortion_policies, 
                               levels = c("most restrictive", 
                                          "very restrictive", 
                                          "restrictive", 
                                          "some restrictions/protections", 
                                          "protective", 
                                          "very protective", 
                                          "most protective"))

library(logistf)

# Fit the exact logistic regression model
fit <- logistf(parental_leave_job_protection ~ factor(abortion_policies), data = df)

# View the results
summary(fit)
logistf(formula = parental_leave_job_protection ~ factor(abortion_policies), 
    data = df)

Model fitted by Penalized ML
Coefficients:
                                                             coef se(coef)
(Intercept)                                            -3.4965076 1.435481
factor(abortion_policies)very restrictive               1.2992830 2.069500
factor(abortion_policies)restrictive                    0.7884574 2.047911
factor(abortion_policies)some restrictions/protections  3.1600353 1.657203
factor(abortion_policies)protective                     3.1287828 1.560985
factor(abortion_policies)very protective                3.2451931 1.602667
factor(abortion_policies)most protective                3.4965076 1.842265
                                                       lower 0.95 upper 0.95
(Intercept)                                            -8.3448894  -1.499056
factor(abortion_policies)very restrictive              -3.9858138   6.588116
factor(abortion_policies)restrictive                   -4.4758771   6.053943
factor(abortion_policies)some restrictions/protections  0.4062025   8.175971
factor(abortion_policies)protective                     0.6945336   8.069477
factor(abortion_policies)very protective                0.6799392   8.218484
factor(abortion_policies)most protective                0.2091242   8.681275
                                                            Chisq            p
(Intercept)                                            19.0554958 1.269716e-05
factor(abortion_policies)very restrictive               0.3735875 5.410552e-01
factor(abortion_policies)restrictive                    0.1450325 7.033284e-01
factor(abortion_policies)some restrictions/protections  5.1499418 2.324715e-02
factor(abortion_policies)protective                     6.8812601 8.710413e-03
factor(abortion_policies)very protective                6.5069583 1.074532e-02
factor(abortion_policies)most protective                4.3329275 3.738190e-02
                                                       method
(Intercept)                                                 2
factor(abortion_policies)very restrictive                   2
factor(abortion_policies)restrictive                        2
factor(abortion_policies)some restrictions/protections      2
factor(abortion_policies)protective                         2
factor(abortion_policies)very protective                    2
factor(abortion_policies)most protective                    2

Method: 1-Wald, 2-Profile penalized log-likelihood, 3-None

Likelihood ratio test=13.27942 on 6 df, p=0.03880713, n=51
Wald test = 12.19211 on 6 df, p = 0.05781772
Code
# Generate predicted probabilities
df$predicted_prob <- predict(fit, type = "response")

# visualize
ggplot(df, aes(x = abortion_policies, y = predicted_prob)) +
  geom_point(position = position_jitter(width = 0.2), alpha = 0.6) +
  stat_summary(fun = mean, geom = "point", color = "blue", size = 3) +
  labs(x = "Abortion Policies", 
       y = "Predicted Probability",
       title = "Parental Leave Job Protection \nPredicted Probabilities by Abortion Policy") +
  theme_minimal() +
  theme(axis.text.x = element_text(angle = 45, hjust = 1))

Due to the sample sizes that were too small, we performed exact logistic regression to account for small sample sizes and separation. The exact logistic regression yielded a p-value of 0.03880713, indicating that there is significant evidence to suggest a linear relationship between job protection policy and the degree of restrictiveness to protectiveness of a state’s abortion policy.

From the coefficients and the visualization, we can see that states with more protective abortion measures are more likely to have policy guaranteeing job protection for parents when they take paternal leave to care for their child(ren). This reveals that states that do not have abortion policy concurrently establish conditions that make having a child more difficult. This is problematic, because if the state creates conditions so that women are forced to have babies instead of being able to access abortion or contraceptive rights and then they have a baby, they face the jeopardy of losing their job and income with no protection.

Percentage of Employed Women per State by Abortion Policy

Code
# Kruskal-Wallis test
kruskal.test(percent_women_working ~ abortion_policies, data = df)

    Kruskal-Wallis rank sum test

data:  percent_women_working by abortion_policies
Kruskal-Wallis chi-squared = 13.889, df = 6, p-value = 0.0309
Code
# If Kruskal-Wallis is significant, run pairwise Wilcoxon tests
pairwise.wilcox.test(df$percent_women_working, 
                     df$abortion_policies,
                     p.adjust.method = "bonferroni")

    Pairwise comparisons using Wilcoxon rank sum test with continuity correction 

data:  df$percent_women_working and df$abortion_policies 

                              most restrictive very restrictive restrictive
very restrictive              1.00             -                -          
restrictive                   0.31             1.00             -          
some restrictions/protections 1.00             1.00             1.00       
protective                    0.13             1.00             1.00       
very protective               1.00             1.00             1.00       
most protective               1.00             1.00             1.00       
                              some restrictions/protections protective
very restrictive              -                             -         
restrictive                   -                             -         
some restrictions/protections -                             -         
protective                    1.00                          -         
very protective               1.00                          1.00      
most protective               1.00                          1.00      
                              very protective
very restrictive              -              
restrictive                   -              
some restrictions/protections -              
protective                    -              
very protective               -              
most protective               1.00           

P value adjustment method: bonferroni 
Code
df$abortion_policies <- factor(df$abortion_policies, 
                              levels = c("most protective", "very protective", 
                                         "protective", "some restrictions/protections", 
                                         "restrictive", "very restrictive", "most restrictive"))

ggplot(df, aes(x = abortion_policies, y = percent_women_working, fill=abortion_policies)) +
  geom_boxplot() +
  theme_minimal() +
  theme(axis.text.x = element_text(size = 8, angle = 45, hjust = 1),
        axis.title.y = element_text(size = 10)) + 
  scale_fill_manual(values = c("#1c7416", "#68bb59", "#acdf87", "#fab733", "#ff6242", "#ff0000", "#c61a09")) +
  labs(x = "Abortion Policies", 
       y = "% Women Employed",
       title = "Percentage of Women Employed by Abortion Policy") +
  theme(plot.title = element_text(size = 10, hjust = 0.5))

The Kruskal-Wallis test yields a significant p-value of 0.0309, allowing us to suggest that there is truly a difference in the mean number of women working in at least one level of abortion policy. However, looking at the accompanying visualization, we cannot suggest that the percentage of employed women consistently increases with more protective abortion policies, because the range and mean of employed women percentages in states with “very protective” abortion policy appear to be quite low; the second lowest in the entire plot after states with the most restrictive policies. This is interesting, because only this group within levels of protective abortion policies stands out as having lower female employment rates.

Code
options(repos = c(CRAN = "https://cloud.r-project.org"))
install.packages("clinfun")

The downloaded binary packages are in
    /var/folders/zl/mwx636ts1pbc006v4wqp6fwr0000gn/T//RtmppaDaK2/downloaded_packages
Code
library(clinfun)

jt_test <- jonckheere.test(df$percent_women_working, 
                          as.numeric(df$abortion_policies), 
                          alternative="decreasing",
                          nperm = 10000)
print(jt_test)

    Jonckheere-Terpstra test

data:  
JT = 395.5, p-value = 0.0135
alternative hypothesis: decreasing
Code
# Permutation test
n_perm <- 10000 

# observed F-statistic
observed_anova <- aov(percent_women_working ~ abortion_policies, data = df)
observed_F <- summary(observed_anova)[[1]][["F value"]][1]

set.seed(6547) 
perm_F <- numeric(n_perm)

for (i in 1:n_perm) {
  # permute the response variable
  permuted_data <- df
  permuted_data$percent_women_working <- sample(permuted_data$percent_women_working)
  
  # ANOVA on permuted data
  perm_anova <- aov(percent_women_working ~ abortion_policies, data = permuted_data)
  perm_F[i] <- summary(perm_anova)[[1]][["F value"]][1]
}

# p-value
p_value <- mean(perm_F >= observed_F)

cat("Observed F-statistic:", observed_F, "\n")
Observed F-statistic: 2.588987 
Code
cat("Permutation Test p-value:", p_value, "\n")
Permutation Test p-value: 0.0296 
Code
# visualization
hist(perm_F, breaks = 30, main = "Permutation Distribution of F-statistic", xlab = "F-statistic")
abline(v = observed_F, col = "red", lwd = 2, lty = 2)

The permutation test of the ANOVA test further confirms that this data has a significant p-value which indicates evidence to suggest that the mean percentage of employed women in at least one abortion policy group is different from the rest, but as noted above with the visualization, it does not necessarily mean that there is a “linear” relationship between more protective abortion policies and higher female employment.

Code
df$abortion_policies <- factor(df$abortion_policies, levels = c("most protective", "very protective", "protective", "some restrictions/protections", "restrictive", "very restrictive", "most restrictive"))

ggplot(df, aes(x = abortion_policies, y = percent_women_working, fill = abortion_policies)) + stat_summary(fun = mean, geom = "bar", position = "dodge") + theme_minimal() + scale_fill_manual(values = c("#1c7416", "#68bb59", "#acdf87", "#fab733", "#ff6242", "#ff0000", "#c61a09")) + theme(axis.text.x = element_text(size = 8, angle = 45, hjust = 1), plot.title = element_text(size = 10, hjust = 0.5)) + labs(x = "Abortion Policies", y = "Mean % Women Working", title = "Mean Percentage of Women Working by Abortion Policy")

Education Wellness

Percentage of Women per State with a BA degree or higher by Abortion Policy

Code
# Kruskal-Wallis test
kruskal.test(percentage_women_BA_or_higher ~ abortion_policies, data = df)

    Kruskal-Wallis rank sum test

data:  percentage_women_BA_or_higher by abortion_policies
Kruskal-Wallis chi-squared = 19.526, df = 6, p-value = 0.003362
Code
# If Kruskal-Wallis is significant, run pairwise Wilcoxon tests
pairwise.wilcox.test(df$percentage_women_BA_or_higher, 
                     df$abortion_policies,
                     p.adjust.method = "bonferroni")

    Pairwise comparisons using Wilcoxon rank sum test with continuity correction 

data:  df$percentage_women_BA_or_higher and df$abortion_policies 

                              most protective very protective protective
very protective               1.0000          -               -         
protective                    1.0000          1.0000          -         
some restrictions/protections 1.0000          1.0000          1.0000    
restrictive                   1.0000          1.0000          1.0000    
very restrictive              1.0000          1.0000          1.0000    
most restrictive              1.0000          0.5285          0.0128    
                              some restrictions/protections restrictive
very protective               -                             -          
protective                    -                             -          
some restrictions/protections -                             -          
restrictive                   1.0000                        -          
very restrictive              1.0000                        1.0000     
most restrictive              1.0000                        0.0077     
                              very restrictive
very protective               -               
protective                    -               
some restrictions/protections -               
restrictive                   -               
very restrictive              -               
most restrictive              1.0000          

P value adjustment method: bonferroni 
Code
ggplot(df, aes(x = abortion_policies, y = percentage_women_BA_or_higher, fill=abortion_policies)) +
  geom_boxplot() +
  theme_minimal() +
  theme(axis.text.x = element_text(size = 8, angle = 45, hjust = 1),
        axis.title.y = element_text(size = 10)) + 
  scale_fill_manual(values = c("#1c7416", "#68bb59", "#acdf87", "#fab733", "#ff6242", "#ff0000", "#c61a09")) +
  labs(x = "Abortion Policies", 
       y = "% Women with a Bachelors degree or higher",
       title = "Percentage of Women with a Bachelors Degree or Higher by Abortion Policy") +
  theme(plot.title = element_text(size = 10, hjust = 0.5)) 

The p-value of 0.003362 indicates that we have significant evidence to reject the null hypothesis and suggest that there is a true different in the mean percentage of women with a college degree (BA) or higher in different states depending on their abortion policy. The boxplot visualization clearly indicates that states with more protective abortion policy have higher percentages of women with high education.

Code
# First, check sample sizes in each group
table(df$abortion_policies)

              most protective               very protective 
                            2                             7 
                   protective some restrictions/protections 
                           10                             5 
                  restrictive              very restrictive 
                            7                             4 
             most restrictive 
                           16 
Code
# Run Jonckheere-Terpstra test
jt_test <- jonckheere.test(df$percentage_women_BA_or_higher, 
                          as.numeric(df$abortion_policies), 
                          alternative="decreasing",
                          nperm = 10000)
print(jt_test)

    Jonckheere-Terpstra test

data:  
JT = 263, p-value = 1e-04
alternative hypothesis: decreasing
Code
# Permutation test
n_perm <- 10000 

# observed F-statistic
observed_anova <- aov(percentage_women_BA_or_higher ~ abortion_policies, data = df)
observed_F <- summary(observed_anova)[[1]][["F value"]][1]

set.seed(6547) 
perm_F <- numeric(n_perm)

for (i in 1:n_perm) {
  # permute the response variable
  permuted_data <- df
  permuted_data$percentage_women_BA_or_higher <- sample(permuted_data$percentage_women_BA_or_higher)
  
  # ANOVA on permuted data
  perm_anova <- aov(percentage_women_BA_or_higher ~ abortion_policies, data = permuted_data)
  perm_F[i] <- summary(perm_anova)[[1]][["F value"]][1]
}

# p-value
p_value <- mean(perm_F >= observed_F)

cat("Observed F-statistic:", observed_F, "\n")
Observed F-statistic: 4.33953 
Code
cat("Permutation Test p-value:", p_value, "\n")
Permutation Test p-value: 0.0019 
Code
# visualization
hist(perm_F, breaks = 30, main = "Permutation Distribution of F-statistic", xlab = "F-statistic")
abline(v = observed_F, col = "red", lwd = 2, lty = 2)

A permutation test was performed on this data in addition to the Kruskal-Wallis test to ensure that the findings were consistent even with different tests. The p-value that the permutation test is also statistically significant at a value of 0.0019, indicating that we have evidence to suggest that the mean percentage of women who have obtained a Bachelor’s degree or higher varies in at least one level of abortion policy.

Code
df$abortion_policies <- factor(df$abortion_policies, levels = c("most protective", "very protective", "protective", "some restrictions/protections", "restrictive", "very restrictive", "most restrictive"))

ggplot(df, aes(x = abortion_policies, y = percentage_women_BA_or_higher, fill = abortion_policies)) + stat_summary(fun = mean, geom = "bar", position = "dodge") + theme_minimal() + scale_fill_manual(values = c("#1c7416", "#68bb59", "#acdf87", "#fab733", "#ff6242", "#ff0000", "#c61a09")) + theme(axis.text.x = element_text(size = 8, angle = 45, hjust = 1), plot.title = element_text(size = 10, hjust = 0.5)) + labs(x = "Abortion Policies", y = "Mean % Women with a BA or Higher", title = "Mean Percentage of Women with a BA or Higher by Abortion Policy")
Warning: Removed 1 row containing non-finite outside the scale range
(`stat_summary()`).

Mental Wellness

Women with Postpartum Depression by Abortion Policy

Code
# Kruskal-Wallis test
kruskal.test(percent_women_with_postpartum_depression_2021 ~ abortion_policies, data = df)

    Kruskal-Wallis rank sum test

data:  percent_women_with_postpartum_depression_2021 by abortion_policies
Kruskal-Wallis chi-squared = 5.3526, df = 6, p-value = 0.4995
Code
pairwise.wilcox.test(df$percent_women_with_postpartum_depression_2021, 
                     df$abortion_policies,
                     p.adjust.method = "bonferroni")

    Pairwise comparisons using Wilcoxon rank sum test with continuity correction 

data:  df$percent_women_with_postpartum_depression_2021 and df$abortion_policies 

                              most protective very protective protective
very protective               1               -               -         
protective                    1               1               -         
some restrictions/protections 1               1               1         
restrictive                   1               1               1         
very restrictive              1               1               1         
most restrictive              1               1               1         
                              some restrictions/protections restrictive
very protective               -                             -          
protective                    -                             -          
some restrictions/protections -                             -          
restrictive                   1                             -          
very restrictive              1                             1          
most restrictive              1                             1          
                              very restrictive
very protective               -               
protective                    -               
some restrictions/protections -               
restrictive                   -               
very restrictive              -               
most restrictive              1               

P value adjustment method: bonferroni 
Code
# accompanying visualization
ggplot(df, aes(x = abortion_policies, y = percent_women_with_postpartum_depression_2021, fill=abortion_policies)) +
  geom_boxplot() +
  theme_minimal() +
  theme(axis.text.x = element_text(size = 8, angle = 45, hjust = 1),
        axis.title.y = element_text(size = 10)) + 
  scale_fill_manual(values = c("#1c7416", "#68bb59", "#acdf87", "#fab733", "#ff6242", "#ff0000", "#c61a09")) +
  labs(x = "Abortion Policies", 
       y = "% Women with Postpartum Depression",
       title = "Percentage of Women with Postpartum Depression by Abortion Policy") +
  theme(plot.title = element_text(size = 10, hjust = 0.5))

The Kruskal-Wallis test does not yield a statistically significant p-value therefore we do not have concrete evidence to suggest that there is a difference in the percentage of women with postpartum depression across the policy groups, but the accompanying visualization does offer provide insight that the highest occurrences of postpartum depression percentages are in states with most restrictive abortion policies.

Code
ggplot(df, aes(x = abortion_policies, y = percent_women_with_postpartum_depression_2021, fill = abortion_policies)) + stat_summary(fun = mean, geom = "bar", position = "dodge") + theme_minimal() + scale_fill_manual(values = c("#1c7416", "#68bb59", "#acdf87", "#fab733", "#ff6242", "#ff0000", "#c61a09")) + theme(axis.text.x = element_text(size = 8, angle = 45, hjust = 1), plot.title = element_text(size = 10, hjust = 0.5)) + labs(x = "Abortion Policies", y = "%", title = "Mean Percentage of Women With Postpartum Depression")
Warning: Removed 18 rows containing non-finite outside the scale range
(`stat_summary()`).

Code
library(dplyr)
library(ggplot2)

# Calculate the overall average
overall_avg <- mean(df$percent_women_with_postpartum_depression_2021, na.rm = TRUE)

# Filter and classify observations as Above or Below Average
df <- df %>%
  filter(!is.na(abortion_policies) & !is.na(percent_women_with_postpartum_depression_2021)) %>% 
  mutate(above_or_below = if_else(percent_women_with_postpartum_depression_2021 > overall_avg, 
                                  "Above Average", "Below Average"))

# Group and count observations
df_count <- df %>%
  group_by(abortion_policies, above_or_below) %>%
  summarize(count = n(), .groups = "drop")

# Convert counts to percentages
df_count <- df_count %>%
  group_by(abortion_policies) %>%
  mutate(percentage = count / sum(count) * 100)

# Create the stacked bar chart
ggplot(df_count, aes(x = abortion_policies, y = percentage, fill = above_or_below)) +
  geom_bar(stat = "identity", position = "stack", alpha = 0.7) +
  theme_minimal() +
  scale_fill_manual(values = c("Above Average" = "#1c7416", "Below Average" = "#ff0000")) +
  labs(x = "Abortion Policies", 
       y = "Percentage of Observations",
       fill = "Comparison to Average",
       title = "Percentage of Above/Below Average Postpartum Depression by Abortion Policy") +
  theme(axis.text.x = element_text(size = 8, angle = 45, hjust = 1),
        plot.title = element_text(size = 10, hjust = 0.5),
        axis.title.y = element_text(size = 10))

Women with Depression Before or During Pregnancy by Abortion Policy

Code
# Kruskal-Wallis test
kruskal.test(women_with_depression_before_or_during_pregnancy_2021 ~ abortion_policies, data = df)

    Kruskal-Wallis rank sum test

data:  women_with_depression_before_or_during_pregnancy_2021 by abortion_policies
Kruskal-Wallis chi-squared = 10.354, df = 6, p-value = 0.1105
Code
pairwise.wilcox.test(df$women_with_depression_before_or_during_pregnancy_2021, 
                     df$abortion_policies,
                     p.adjust.method = "bonferroni")

    Pairwise comparisons using Wilcoxon rank sum test with continuity correction 

data:  df$women_with_depression_before_or_during_pregnancy_2021 and df$abortion_policies 

                              most protective very protective protective
very protective               1.00            -               -         
protective                    1.00            1.00            -         
some restrictions/protections 1.00            1.00            1.00      
restrictive                   1.00            1.00            1.00      
very restrictive              1.00            1.00            1.00      
most restrictive              1.00            0.68            1.00      
                              some restrictions/protections restrictive
very protective               -                             -          
protective                    -                             -          
some restrictions/protections -                             -          
restrictive                   1.00                          -          
very restrictive              1.00                          1.00       
most restrictive              1.00                          1.00       
                              very restrictive
very protective               -               
protective                    -               
some restrictions/protections -               
restrictive                   -               
very restrictive              -               
most restrictive              1.00            

P value adjustment method: bonferroni 
Code
# accompanying visualization
ggplot(df, aes(x = abortion_policies, y = women_with_depression_before_or_during_pregnancy_2021, fill=abortion_policies)) +
  geom_boxplot() +
  theme_minimal() +
  theme(axis.text.x = element_text(size = 8, angle = 45, hjust = 1),
        axis.title.y = element_text(size = 10)) + 
  scale_fill_manual(values = c("#1c7416", "#68bb59", "#acdf87", "#fab733", "#ff6242", "#ff0000", "#c61a09")) +
  labs(x = "Abortion Policies", 
       y = "% Women with Depression Before or During Pregnancy",
       title = "Percentage of Women with Depression Before or During Pregnancy by Abortion Policy") +
  theme(plot.title = element_text(size = 10, hjust = 0.5))

The Kruskal Wallis test does not provide any evidence to suggest that there is an association between depression before or during pregnancy among women and the abortion policy strictness in their state.

Code
ggplot(df, aes(x = abortion_policies, y = women_with_depression_before_or_during_pregnancy_2021, fill = abortion_policies)) + stat_summary(fun = mean, geom = "bar", position = "dodge") + theme_minimal() + scale_fill_manual(values = c("#1c7416", "#68bb59", "#acdf87", "#fab733", "#ff6242", "#ff0000", "#c61a09")) + theme(axis.text.x = element_text(size = 8, angle = 45, hjust = 1), plot.title = element_text(size = 10, hjust = 0.5)) + labs(x = "Abortion Policies", y = "%", title = "Mean Percentage of Women With Depression Before or During Pregnancy by Abortion Policy")

Percentage of Women Who Received a Postpartum Depression Screening in 2021, by Abortion Policy

Code
# Kruskal-Wallis test
kruskal.test(percent_women_who_received_a_postpartum_depression_screening_2021 ~ abortion_policies, data = df)

    Kruskal-Wallis rank sum test

data:  percent_women_who_received_a_postpartum_depression_screening_2021 by abortion_policies
Kruskal-Wallis chi-squared = 15.405, df = 6, p-value = 0.01733
Code
pairwise.wilcox.test(df$percent_women_who_received_a_postpartum_depression_screening_2021, 
                     df$abortion_policies,
                     p.adjust.method = "bonferroni")

    Pairwise comparisons using Wilcoxon rank sum test with continuity correction 

data:  df$percent_women_who_received_a_postpartum_depression_screening_2021 and df$abortion_policies 

                              most protective very protective protective
very protective               1.000           -               -         
protective                    1.000           1.000           -         
some restrictions/protections 1.000           1.000           1.000     
restrictive                   1.000           1.000           1.000     
very restrictive              1.000           1.000           1.000     
most restrictive              1.000           1.000           0.088     
                              some restrictions/protections restrictive
very protective               -                             -          
protective                    -                             -          
some restrictions/protections -                             -          
restrictive                   1.000                         -          
very restrictive              1.000                         1.000      
most restrictive              1.000                         0.345      
                              very restrictive
very protective               -               
protective                    -               
some restrictions/protections -               
restrictive                   -               
very restrictive              -               
most restrictive              1.000           

P value adjustment method: bonferroni 
Code
# accompanying visualization
ggplot(df, aes(x = abortion_policies, y = percent_women_who_received_a_postpartum_depression_screening_2021, fill=abortion_policies)) +
  geom_boxplot() +
  theme_minimal() +
  theme(axis.text.x = element_text(size = 8, angle = 45, hjust = 1),
        axis.title.y = element_text(size = 10)) + 
  scale_fill_manual(values = c("#1c7416", "#68bb59", "#acdf87", "#fab733", "#ff6242", "#ff0000", "#c61a09")) +
  labs(x = "Abortion Policies", 
       y = "% Women Receiving Postpartum Depression Screening",
       title = "Percentage of Women who Received Postpartum Depression Screening by Abortion Policy") +
  theme(plot.title = element_text(size = 8, hjust = 0.5))

Code
# Run Jonckheere-Terpstra test
jt_test <- jonckheere.test(df$percent_women_who_received_a_postpartum_depression_screening_2021, 
                          as.numeric(df$abortion_policies), 
                          alternative="decreasing",
                          nperm = 10000)
print(jt_test)

    Jonckheere-Terpstra test

data:  
JT = 105, p-value = 3e-04
alternative hypothesis: decreasing
Code
# Permutation test
n_perm <- 10000 

# observed F-statistic
observed_anova <- aov(percent_women_who_received_a_postpartum_depression_screening_2021 ~ abortion_policies, data = df)
observed_F <- summary(observed_anova)[[1]][["F value"]][1]

set.seed(6547) 
perm_F <- numeric(n_perm)

for (i in 1:n_perm) {
  # Permute the response variable
  permuted_data <- df
  permuted_data$percent_women_who_received_a_postpartum_depression_screening_2021 <- sample(permuted_data$percent_women_who_received_a_postpartum_depression_screening_2021)
  
  # Perform ANOVA on permuted data
  perm_anova <- aov(percent_women_who_received_a_postpartum_depression_screening_2021 ~ abortion_policies, data = permuted_data)
  perm_F[i] <- summary(perm_anova)[[1]][["F value"]][1]
}

# Calculate p-value
p_value <- mean(perm_F >= observed_F)

# Print results
cat("Observed F-statistic:", observed_F, "\n")
Observed F-statistic: 4.212749 
Code
cat("Permutation Test p-value:", p_value, "\n")
Permutation Test p-value: 0.0091 
Code
# Visualize the permutation distribution
hist(perm_F, breaks = 30, main = "Permutation Distribution of F-statistic", xlab = "F-statistic")
abline(v = observed_F, col = "red", lwd = 2, lty = 2)

The Kruskal-Wallis test indicates that we have significant evidence at a p-value of 0.01733 that there is an association between the percentage of women screened for postpartum depression in a state and the state’s abortion policy. The boxplot visualization reveals that postpartum depression screening is the highest in states with the most protective abortion policies, where approximately 94% of women are screened. The lowest percentage of women screened for postpartum depression is in the states that are the most restrictive. This suggests that states that prioritize screening for mothers’ mental health are also the most considerate of abortion rights, whereas states that chose to implement more restrictive abortion policies already do not prioritize widespread screening for mothers’ postpartum depression.

Code
ggplot(df, aes(x = abortion_policies, y = percent_women_who_received_a_postpartum_depression_screening_2021, fill = abortion_policies)) + stat_summary(fun = mean, geom = "bar", position = "dodge") + theme_minimal() + scale_fill_manual(values = c("#1c7416", "#68bb59", "#acdf87", "#fab733", "#ff6242", "#ff0000", "#c61a09")) + theme(axis.text.x = element_text(size = 8, angle = 45, hjust = 1), plot.title = element_text(size = 10, hjust = 0.5)) + labs(x = "Abortion Policies", y = "%", title = "Mean Percentage of Women Recieved Postpartum Depression Screening")

Percentage of Women Experiencing Intimate Partner Violence by Abortion Policy

Code
# Kruskal-Wallis test
kruskal.test(percent_women_experiencing_intimate_partner_violence_2021 ~ abortion_policies, data = df)

    Kruskal-Wallis rank sum test

data:  percent_women_experiencing_intimate_partner_violence_2021 by abortion_policies
Kruskal-Wallis chi-squared = 10.88, df = 6, p-value = 0.09215
Code
pairwise.wilcox.test(df$percent_women_experiencing_intimate_partner_violence_2021, 
                     df$abortion_policies,
                     p.adjust.method = "bonferroni")

    Pairwise comparisons using Wilcoxon rank sum test with continuity correction 

data:  df$percent_women_experiencing_intimate_partner_violence_2021 and df$abortion_policies 

                              most protective very protective protective
very protective               1.00            -               -         
protective                    1.00            1.00            -         
some restrictions/protections 1.00            1.00            1.00      
restrictive                   1.00            1.00            1.00      
very restrictive              1.00            1.00            1.00      
most restrictive              1.00            0.58            0.84      
                              some restrictions/protections restrictive
very protective               -                             -          
protective                    -                             -          
some restrictions/protections -                             -          
restrictive                   1.00                          -          
very restrictive              1.00                          1.00       
most restrictive              1.00                          1.00       
                              very restrictive
very protective               -               
protective                    -               
some restrictions/protections -               
restrictive                   -               
very restrictive              -               
most restrictive              1.00            

P value adjustment method: bonferroni 
Code
# accompanying visualization
ggplot(df, aes(x = abortion_policies, y = percent_women_experiencing_intimate_partner_violence_2021, fill=abortion_policies)) +
  geom_boxplot() +
  theme_minimal() +
  theme(axis.text.x = element_text(size = 8, angle = 45, hjust = 1),
        axis.title.y = element_text(size = 10)) + 
  scale_fill_manual(values = c("#1c7416", "#68bb59", "#acdf87", "#fab733", "#ff6242", "#ff0000", "#c61a09")) +
  labs(x = "Abortion Policies", 
       y = "% Women Experiencing Domestic Violence",
       title = "Percentage of Women Experiencing Intimate Partner Violence by Abortion Policy") +
  theme(plot.title = element_text(size = 10, hjust = 0.5))

The Kruskal-Wallis Test and the boxplot visualization to not provide any evidence to suggest that there is a relationship between the prevalence of women experiencing domestic violence and the state’s abortion policy.

Code
# Run Jonckheere-Terpstra test
jt_test <- jonckheere.test(df$percent_women_experiencing_intimate_partner_violence_2021, 
                          as.numeric(df$abortion_policies), 
                          alternative="increasing",
                          nperm = 10000)
print(jt_test)

    Jonckheere-Terpstra test

data:  
JT = 298.5, p-value = 0.0077
alternative hypothesis: increasing

Travel

Code
install.packages("usmap")
also installing the dependency 'usmapdata'

The downloaded binary packages are in
    /var/folders/zl/mwx636ts1pbc006v4wqp6fwr0000gn/T//RtmppaDaK2/downloaded_packages
Code
library(usmap)
Warning: package 'usmap' was built under R version 4.3.2
Code
library()
abortion_map_data <- df %>%
  select(State, `percent_increase_in_clinician_provided_abortions_since_2020`) %>%
  filter(!is.na(`percent_increase_in_clinician_provided_abortions_since_2020`)) %>%
  mutate(
    state = State,
    change = `percent_increase_in_clinician_provided_abortions_since_2020`
  ) 

print("First few rows of our data:")
[1] "First few rows of our data:"
Code
head(abortion_map_data)
# A tibble: 6 × 4
  State                percent_increase_in_clinician_provided_abo…¹ state change
  <chr>                                                       <dbl> <chr>  <dbl>
1 Colorado                                                     96.1 Colo…   96.1
2 Delaware                                                     92.2 Dela…   92.2
3 District of Columbia                                          2.7 Dist…    2.7
4 Georgia                                                      24.3 Geor…   24.3
5 Hawaii                                                       27.3 Hawa…   27.3
6 Illinois                                                     71.6 Illi…   71.6
# ℹ abbreviated name:
#   ¹​percent_increase_in_clinician_provided_abortions_since_2020
Code
plot_usmap(data = abortion_map_data, values = "change") +
  scale_fill_gradient2(
    name = "% Change",
    low = "#ff6b6b",   
    mid = "#ffffff",    
    high = "#4dbd00",  
    midpoint = 0,
    na.value = "gray80"  
  ) +
  theme(
    legend.position = "right",
    plot.title = element_text(size = 14, hjust = 0.5)
  ) +
  labs(title = "Change in Clinician-Provided Abortions Since 2020 by State")

Code
install.packages(c("usmap", "ggplot2"))

The downloaded binary packages are in
    /var/folders/zl/mwx636ts1pbc006v4wqp6fwr0000gn/T//RtmppaDaK2/downloaded_packages
Code
library(usmap)
library(ggplot2)
library(dplyr)
library(sf)
Linking to GEOS 3.11.0, GDAL 3.5.3, PROJ 9.1.0; sf_use_s2() is TRUE
Code
# Create data frame with state capitals for the specified states
capitals_df <- data.frame(
  lon = c(-116.200, -100.346, -93.603, -92.189, -92.289,
          -91.187, -97.743, -97.503, -90.182, -86.279,
          -84.280, -86.781, -84.878, -86.162, -81.633),
  lat = c(43.617, 44.367, 41.591, 38.579, 34.746,
          30.457, 30.266, 35.482, 32.298, 32.377,
          30.438, 36.162, 38.186, 39.768, 38.336),
  state = c("Idaho", "South Dakota", "Iowa", "Missouri", "Arkansas",
            "Louisiana", "Texas", "Oklahoma", "Mississippi", "Alabama",
            "Florida", "Tennessee", "Kentucky", "Indiana", "West Virginia")
)


# Plot the map with stars for capitals
plot_usmap(data = abortion_map_data, values = "percent_increase_in_clinician_provided_abortions_since_2020") +
  geom_point(data = capitals_df, 
             aes(x = lon, y = lat), # Use longitude and latitude for capitals
             shape = 8,         # Star shape
             size = 4,          # Adjust size
             color = "black",   # Star color
             stroke = 1.5) +    # Thickness of the outline
  scale_fill_gradient2(
    name = "% Change",
    low = "#ff6b6b",   
    mid = "#ffffff",    
    high = "#4dbd00",  
    midpoint = 0,
    na.value = "gray80"  
  ) +
  theme_minimal() +
  theme(
    legend.position = "right",
    plot.title = element_text(size = 14, hjust = 0.5)
  ) +
  labs(title = "Change in Clinician-Provided Abortions Since 2020 by State",
       caption = "*Stars indicate state capitals*")